A Fitness Distance Correlation Measure for Evolutionary Trees
Abstract
Phylogenetics is concerned with inferring the genealogical relationships between a group of organisms (or taxa), and this relationship is usually expressed as an evolutionary tree. However, inferring the phylogenetic tree is not a trivial task, since it is impossible to know the true evolutionary history for a set of organisms. As a result, most phylogenetic analyses rely on effective heuristics for obtaining accurate trees. These heuristics use the tree score as a basis for establishing an accurate depiction of evolutionary relationships. Relatively little work has been done to analyze the relationship between improving tree scores (fitness) and topological accuracy (distance). In this paper, we present a new fitness-distance correlation coefficient called rFD to quantify the relationship between evolutionary trees. By applying this measure to three biological datasets consisting of 44, 60, and 174 taxa, our results show that improvements in fitness are strongly correlated (rFD > 0.8) with topological accuracy relative to the best overall tree. Moreover, we investigated the use of the rFD coefficient when the best overall tree is not available and found similar results. Thus, our results show that rFD is a robust measure with several potential applications, such as the development of stopping criteria for phylogenetic search.
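The abstract does not give the formula for rFD, so the following is only a minimal sketch of the general idea, assuming the classic fitness-distance correlation: a Pearson correlation between each tree's score and its topological distance to a reference tree. The function name `fitness_distance_correlation`, the sample scores, and the choice of a Robinson-Foulds-style distance are all hypothetical and may differ from the paper's actual definition.

```python
import math


def fitness_distance_correlation(fitness, distance):
    """Pearson correlation between fitness values and distances.

    Assumption: this is the textbook fitness-distance correlation,
    not necessarily the rFD coefficient defined in the paper.
    """
    if len(fitness) != len(distance) or len(fitness) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    n = len(fitness)
    mean_f = sum(fitness) / n
    mean_d = sum(distance) / n

    # Sums of centered products and centered squares.
    cov = sum((f - mean_f) * (d - mean_d) for f, d in zip(fitness, distance))
    var_f = sum((f - mean_f) ** 2 for f in fitness)
    var_d = sum((d - mean_d) ** 2 for d in distance)

    if var_f == 0 or var_d == 0:
        raise ValueError("correlation is undefined for constant input")

    return cov / math.sqrt(var_f * var_d)


# Hypothetical data: tree scores where lower is better (for example
# parsimony scores) and each tree's distance to the best tree found.
tree_scores = [1250, 1240, 1235, 1231, 1230]
distance_to_best = [40, 30, 22, 10, 0]

r = fitness_distance_correlation(tree_scores, distance_to_best)
print(f"correlation = {r:.3f}")
```

With scores where lower is better, a value near 1 means that trees with better scores also sit closer to the reference tree. If higher scores are better (for example log-likelihoods), the sign flips, so the sign convention has to be fixed before comparing against a threshold such as 0.8.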
Similar resources
An Empirical Investigation of How Degree Neutrality Affects GP Search
Over the last few years, neutrality has inspired many researchers in the area of Evolutionary Computation (EC) systems in the hope that it can aid evolution. However, there are contradictory results on the effects of neutrality in evolutionary search. The aim of this paper is to understand how neutrality, which we call degree neutrality in this paper, affects GP search. For analysis purposes, we use a well-d...
Genotype-Fitness Correlation Analysis for Evolutionary Design of Self-assembly Wang Tiles
In previous work we reported on the evolutionary design optimisation of self-assembling Wang tiles. Apart from the achieved findings [11], nothing has yet been said about the effectiveness with which individuals were evaluated, in particular when the mapping from genotype to phenotype and from this to fitness is an intricate relationship. In this paper we aim to report whether our genetic ...
Algorithms for Computing the Quartet Distance
Evolutionary (phylogenetic) trees are constructs of the biological and medical sciences; their purpose is to establish the relationship between a set of species (phyla). Often the true evolutionary tree is unknown and one can only try to estimate it. Reconstruction methods are manifold and the resulting evolutionary trees are not guaranteed to be correct. In order to establi...
Fitness Distance Correlation And Problem Difficulty For Genetic Programming
This work is a first step in the attempt to verify whether (and in which cases) fitness distance correlation can be a good tool for classifying problems on the basis of their difficulty for genetic programming. By analogy with the studies that have already been done on genetic algorithms, we define some notions of distance between genotypes. Then we choose one of these distances to calculate th...
Correlation analysis and performance evaluation of distance measures for evolutionary neural networks
In a genetic algorithm, the search process maintains multiple solutions, and their interactions are important for accelerating the evolution. If the pool of solutions is dominated by a single fittest individual in the early generations, there is a risk of premature convergence and loss of exploration capability. It is necessary to consider not only the fitness of solutions but also the similarity to ot...